
    Some highlights on Source-to-Source Adjoint AD

    Algorithmic Differentiation (AD) provides the analytic derivatives of functions given as programs. Adjoint AD, which computes gradients, is similar to Back Propagation in Machine Learning. AD researchers study strategies to overcome the difficulties of adjoint AD and to get closer to its theoretical efficiency. To promote fruitful exchanges between Back Propagation and adjoint AD, we present three of these strategies and give our view of their interest and current status.
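
    A minimal hand-written sketch (not taken from the paper) may help contrast the two modes. The function and all names below are invented for illustration: tangent mode propagates one input direction forward, while adjoint mode propagates one output sensitivity backward and obtains the whole gradient in a single sweep, exactly as back propagation does.

        /* f(x1,x2) = sin(x1*x2), differentiated by hand in both AD modes. */
        #include <math.h>
        #include <stdio.h>

        /* Tangent mode: (x1d, x2d) is the input direction; returns the
           directional derivative of f along it. */
        double f_tangent(double x1, double x2, double x1d, double x2d) {
            double t  = x1 * x2;
            double td = x1d * x2 + x1 * x2d;  /* product rule, forward order */
            return cos(t) * td;
        }

        /* Adjoint mode: yb is the output sensitivity; the statements of the
           primal are handled in reverse order, accumulating the gradient. */
        void f_adjoint(double x1, double x2, double yb,
                       double *x1b, double *x2b) {
            double t  = x1 * x2;     /* forward sweep recomputes intermediates */
            double tb = cos(t) * yb; /* reverse of y = sin(t) */
            *x1b += tb * x2;         /* reverse of t = x1 * x2 */
            *x2b += tb * x1;
        }

        int main(void) {
            double x1b = 0.0, x2b = 0.0;
            f_adjoint(1.5, 0.5, 1.0, &x1b, &x2b);  /* one sweep: full gradient */
            printf("grad = (%g, %g)\n", x1b, x2b);
            printf("dir. deriv along (1,0) = %g\n",
                   f_tangent(1.5, 0.5, 1.0, 0.0));  /* equals grad[0] */
            return 0;
        }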

    TAPENADE 2.1 user's guide

    This is the user's manual for version 2.1 of the Automatic Differentiation tool TAPENADE. Given a source computer program that computes a differentiable mathematical function F, TAPENADE builds a new source program that computes some of the derivatives of F, specifically directional derivatives ("tangent mode") and gradients ("reverse mode"). This report summarizes the mathematical justifications of Automatic Differentiation, then describes in full detail the differentiation model that TAPENADE implements. Our goal is to give the users of TAPENADE a precise understanding of the actions and choices made while differentiating programs, so as to improve their confidence in the produced source programs. This report documents all the options and parameterizations that users can give to TAPENADE, and conversely all the diagnostics and requirements that TAPENADE may issue to users. After a brief description of TAPENADE's architecture and performance, this report describes more fully the validation and improvement techniques for differentiated codes.
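
    As an illustration of the two modes described above, here is the kind of tangent code a source-transformation tool produces for a small routine. This is a hedged sketch: the primal routine is invented, and the exact shape and naming of TAPENADE's output (e.g. the "_d" suffix for tangent routines and variables) vary with tool version and options.

        #include <stdio.h>

        /* Invented primal routine: y = 3*x^2 + x. */
        void foo(double x, double *y) {
            double t = x * x;
            *y = 3.0 * t + x;
        }

        /* Tangent sketch: xd is the input direction, yd the directional
           derivative; each active statement is paired with its linearization. */
        void foo_d(double x, double xd, double *y, double *yd) {
            double t  = x * x;
            double td = 2.0 * x * xd;
            *yd = 3.0 * td + xd;
            *y  = 3.0 * t + x;
        }

        int main(void) {
            double y, yd;
            foo_d(2.0, 1.0, &y, &yd);
            printf("y = %g, dy/dx = %g\n", y, yd);  /* expect 14 and 13 */
            return 0;
        }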

    On the correct application of AD checkpointing to adjoint MPI-parallel programs

    Checkpointing is a classical technique to mitigate the overhead of adjoint Algorithmic Differentiation (AD). In the context of source-transformation AD with the Store-All approach, checkpointing reduces the peak memory consumption of the adjoint, at the cost of duplicate runs of selected pieces of the code. Checkpointing is vital for codes with long run times, which is the case of most MPI-parallel applications. However, the presence of MPI communications seriously restricts the application of checkpointing. In most attempts to apply checkpointing to adjoint MPI codes (the "popular" approach), a number of restrictions apply to the form of communications that may occur in the checkpointed piece of code. In many works these restrictions are left implicit, and an application that does not respect them may produce erroneous code. We propose techniques to apply checkpointing to adjoint MPI codes that either do not assume these restrictions, or make them explicit so that end users can verify their applicability. These techniques rely both on adapting the snapshot mechanism of checkpointing and on modifying the behavior of communication calls. One technique is based on logging the values received, so that the duplicated communications need not take place. Although this technique completely lifts the restrictions on checkpointing MPI codes, message logging makes it more costly than the popular approach. However, we can refine this technique to blend message logging and communication duplication wherever possible, so that the refined technique encompasses the popular approach. We provide elements of proof of correctness of our refined technique, i.e. that it preserves the semantics of the adjoint code and does not introduce deadlocks.
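
    The message-logging idea can be sketched in a few lines of MPI code. The sketch below is a simplified illustration, not the paper's actual mechanism or implementation: a wrapper logs every value received inside the checkpointed piece during its first run, and the duplicated run replays the log, so the sender never needs to resend. All names, sizes, and the replay switch are invented. Compile with mpicc and run with two ranks.

        #include <mpi.h>
        #include <stdio.h>

        static double recv_log[16];  /* values received during the first run */
        static int    log_len = 0;
        static int    replay  = 0;   /* 0: log mode, 1: replay mode */
        static int    cursor  = 0;   /* read position during replay */

        /* Receive wrapper: logs on the first run, replays on the duplicate. */
        static void logged_recv(double *buf, int src, int tag) {
            if (replay) { *buf = recv_log[cursor++]; return; }
            MPI_Recv(buf, 1, MPI_DOUBLE, src, tag,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            recv_log[log_len++] = *buf;
        }

        /* The piece chosen for checkpointing; it contains a receive. */
        static void checkpointed_piece(int rank, double *y) {
            double x = 0.0;
            if (rank == 0) {
                x = 2.5;
                MPI_Send(&x, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            }
            if (rank == 1) logged_recv(&x, 0, 0);
            *y = x * x;  /* some local computation */
        }

        int main(int argc, char **argv) {
            int rank; double y;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            checkpointed_piece(rank, &y);  /* first (logging) run */

            /* The adjoint later duplicates the piece on rank 1 only: with
               replay on, the receive reads the log and rank 0 is not
               involved, so no communication is duplicated. */
            replay = 1; cursor = 0;
            if (rank == 1) {
                checkpointed_piece(rank, &y);
                printf("replayed y = %g\n", y);
            }
            MPI_Finalize();
            return 0;
        }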

    Source-transformation Differentiation of a C++-like Ice Sheet model

    Algorithmic Differentiation (AD) has become one of the most powerful tools to improve our understanding of the Earth System. While AD has been used by the ocean and atmospheric circulation modeling community for almost 20 years, it is relatively new in the ice sheet modeling community. The Ice Sheet System Model (ISSM) is a C++, object-oriented, massively parallelized, new-generation ice sheet model that recently implemented AD to improve its data assimilation capabilities. ISSM currently relies on operator overloading through ADOL-C and AMPI. However, experience shows that operator-overloading AD on ISSM is significantly more memory-intensive than the primal code. We want to investigate other AD approaches to improve the performance of the AD adjoint. Yet, to our knowledge, there is no source-to-source AD tool that supports C++. To overcome this problem, we have developed a prototype of ISSM entirely in C, called Boreas, in order to test source-to-source transformation and compare the performance of these two approaches to AD. Boreas is a clone of ISSM; the main difference is that all objects are converted to C structures and some function names have been adapted to be unique, but the code architectures are identical. The programming style of Boreas is a first attempt at defining a programming style (or a sub-language) of C++ that source-transformation AD could handle. We present here the first results of source-transformation AD of Boreas with the AD tool Tapenade.
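
    The conversion style can be illustrated with a hypothetical fragment (not actual ISSM or Boreas source): a C++ class becomes a C structure plus free functions whose names are made unique, keeping the architecture identical so that a C-capable source-to-source tool can differentiate it.

        #include <stdio.h>

        /* C++ original (sketched as a comment):
             class Element {
               double area;
             public:
               double Mass(double density) { return density * area; }
             };                                                        */

        /* C conversion: the class becomes a plain structure. */
        typedef struct Element {
            double area;
        } Element;

        /* The method becomes a free function; prefixing the type name
           keeps the function name unique across the code base. */
        double ElementMass(const Element *self, double density) {
            return density * self->area;
        }

        int main(void) {
            Element e = { 2.0 };
            printf("mass = %g\n", ElementMass(&e, 900.0));
            return 0;
        }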

    Automatic Differentiation of Parallel Loops with Formal Methods

    This paper presents a novel combination of reverse-mode automatic differentiation and formal methods to enable efficient differentiation of (or back-propagation through) shared-memory parallel loops. Compared to the state of the art, our approach can reduce the need for atomic updates or private data copies during the parallel derivative computation, even in the presence of unstructured or data-dependent access patterns. This is achieved by gathering information about the memory access patterns of the input program, which is assumed to be correctly parallelized. This information is then used to build a model of assertions in a theorem prover, which can be used to check the safety of shared-memory accesses during the parallel derivative loops. We demonstrate this approach on scientific computing benchmarks, including a lattice-Boltzmann method (LBM) solver from the Parboil benchmark suite and a Green's function Monte Carlo (GFMC) kernel from the CORAL benchmark suite.
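
    The problem being addressed can be shown on an invented loop (the paper's benchmarks are far larger, and the theorem-prover machinery is not reproduced here). In the primal loop each iteration reads x[idx[i]]; the adjoint loop therefore increments xb[idx[i]], which races whenever two iterations share an index. An atomic update is the conservative fix; if the analysis can prove idx injective, the atomic can be dropped.

        #include <stdio.h>

        /* Primal: a gather through an index array. */
        void primal(int n, const int *idx, const double *x, double *y) {
            #pragma omp parallel for
            for (int i = 0; i < n; i++)
                y[i] = 2.0 * x[idx[i]];
        }

        /* Adjoint: the gather becomes a scatter into xb. */
        void adjoint(int n, const int *idx, double *xb, const double *yb) {
            #pragma omp parallel for
            for (int i = 0; i < n; i++) {
                /* Conservative atomic; removable if idx is proven injective. */
                #pragma omp atomic
                xb[idx[i]] += 2.0 * yb[i];
            }
        }

        int main(void) {
            int idx[4] = {0, 1, 1, 2};  /* index 1 is shared: a real race */
            double x[3] = {1, 2, 3}, y[4];
            double xb[3] = {0}, yb[4] = {1, 1, 1, 1};
            primal(4, idx, x, y);
            adjoint(4, idx, xb, yb);
            printf("xb = %g %g %g\n", xb[0], xb[1], xb[2]);  /* expect 2 4 2 */
            return 0;
        }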

    Experiments on Checkpointing Adjoint MPI Programs

    Checkpointing is a classical strategy to reduce the peak memory consumption of the adjoint. Checkpointing is vital for codes with long run times, which is the case of most MPI parallel applications. However, for MPI codes this question has always been addressed by ad-hoc hand manipulation of the differentiated code, with no formal assurance of correctness. In a previous work, we investigated the assumptions implicitly made during past experiments, in order to clarify and generalize them. On the one hand, we proposed an adaptation of checkpointing to MPI parallel programs with point-to-point communications, such that the semantics of the adjoint program is preserved for any choice of the checkpointed part. On the other hand, we proposed an alternative adaptation of checkpointing that is more efficient but places a number of restrictions on the choice of the checkpointed part. In this work we consider checkpointing MPI parallel programs from a practical point of view. We propose an implementation of the adapted techniques inside the AMPI library. We discuss practical questions about the choice of the technique to be applied within a checkpointed part, and about the choice of the checkpointed part itself. Finally, we validate our theoretical results on representative CFD codes.

    Source-to-Source Automatic Differentiation of OpenMP Parallel Loops

    This paper presents our work toward correct and efficient automatic differentiation of OpenMP parallel worksharing loops in forward and reverse mode. Automatic differentiation is a method to obtain gradients of numerical programs, which are crucial in optimization, uncertainty quantification, and machine learning. The computational cost of gradients is a common bottleneck in practice. For applications that are parallelized for multicore CPUs or GPUs using OpenMP, one also wishes to compute the gradients in parallel. We propose a framework to reason about the correctness of the generated derivative code, from which we justify our OpenMP extension to the differentiation model. We implement this model in the automatic differentiation tool Tapenade and present test cases that are differentiated following our extended differentiation procedure. Performance of the generated derivative programs in forward and reverse mode is better than that of sequential derivative code, although our reverse mode often scales worse than the input programs.
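
    For forward mode, the extension is comparatively benign, as this invented sketch suggests (the loop body is not from the paper's test cases): the derivative statements of an iteration stay inside that iteration, so the tangent loop can keep the primal worksharing parallelization.

        #include <math.h>
        #include <stdio.h>

        /* Primal worksharing loop. */
        void step(int n, const double *x, double *y) {
            #pragma omp parallel for
            for (int i = 0; i < n; i++)
                y[i] = sin(x[i]) * x[i];
        }

        /* Tangent: xd is the input direction, yd the directional derivative;
           each iteration is self-contained, so the pragma is kept as is. */
        void step_d(int n, const double *x, const double *xd,
                    double *y, double *yd) {
            #pragma omp parallel for
            for (int i = 0; i < n; i++) {
                yd[i] = cos(x[i]) * x[i] * xd[i] + sin(x[i]) * xd[i];
                y[i]  = sin(x[i]) * x[i];
            }
        }

        int main(void) {
            double x[2] = {0.5, 1.0}, xd[2] = {1.0, 0.0}, y[2], yd[2];
            step_d(2, x, xd, y, yd);
            printf("y[0] = %g, dy[0]/dx[0] = %g\n", y[0], yd[0]);
            return 0;
        }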

    Mixed-language Automatic Differentiation

    As AD usage spreads to larger and more sophisticated applications, problems arise for codes that use several programming languages. Many AD tools have been designed with one application language in mind. Only a few use an internal representation that promotes language independence, at least conceptually. When faced with the problem of building (with AD) the derivative code of a mixed-language application, end users may consider using several AD tools, one per language. However, this leads to several problems:
    • Different AD tools may implement very different AD models, such as overloading-based versus source-transformation-based, or association-by-address versus association-by-name. These models are often not compatible.
    • When selecting the source-transformation model (for efficiency of the differentiated code), performance of the differentiated code strongly depends on the quality of data-flow analysis, which must be global over the code. A global analysis with separate AD tools would require inter-tool communication at the level of data-flow analysis, which does not exist at present. In any case, interoperable data-flow analysis between tools implies that the tools share their analysis strategy, which is almost never the case.
    Consequently, we think the only viable approach is to use a single tool, with a single internal representation and data-flow analysis strategy, converting each source file to this unique representation regardless of its original language. It turns out that Tapenade [1] provides such an internal representation, accessible at present from C or Fortran sources. Other AD tools provide a language-independent internal representation. OpenAD provides such a representation based on the XAIF formalism. However, this gives rise to two separate tools, OpenAD/F for Fortran and ADIC2 for C. Still, there seems to be no deep reason preventing the application of OpenAD to mixed-language codes. We lack information about a common architecture between TAF and TAC++ that would allow such mixed-language AD. Rapsodia [2] was the first AD tool to support algorithmic differentiation in tangent mode of mixed-language components, specifically C++ and Fortran. As Rapsodia uses operator overloading, it performs no global analysis of the code. To our knowledge, the extension of Rapsodia's mixed-language differentiation to adjoint mode is not yet available. A small sketch of a language-boundary crossing follows.
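
    This hypothetical fragment shows why a single internal representation matters: data flow crosses the language boundary by address, under a compiler-dependent name-mangling convention, so activity and data-flow analysis cannot stop at the file boundary. All names below are invented, and linking the C driver requires compiling the Fortran file alongside it.

        /* Fortran side (separate file, sketched as a comment):
             subroutine scale(x, a)
               double precision x, a
               x = a * x
             end subroutine                                            */

        /* C prototype for the Fortran routine: the trailing underscore and
           pass-by-reference follow a common, compiler-dependent convention. */
        extern void scale_(double *x, double *a);

        double driver(double x) {
            double a = 3.0;
            scale_(&x, &a);  /* the data flow through x crosses languages */
            return x * x;    /* AD must know x is still active here */
        }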

    The Tapenade Automatic Differentiation tool: principles, model, and specification

    Tapenade is an Automatic Differentiation tool which, given a Fortran or C code that computes a function, creates a new code that computes its tangent or adjoint derivatives. Tapenade puts particular emphasis on adjoint differentiation, which computes gradients at a remarkably low cost. This paper describes the principles of Tapenade, a subset of the general principles of AD. We motivate and illustrate on examples the AD model of Tapenade, i.e. the structure of differentiated codes and the strategies used to make them more efficient. Along with this informal description, we formally specify this model by means of data-flow equations and rules of operational semantics, making this the reference specification of the tangent and adjoint modes of Tapenade. One benefit we expect from this formal specification is the capacity to study the AD model itself formally, especially for the adjoint mode and its sophisticated strategies. This paper also describes the architectural choices of the implementation of Tapenade. We describe the current performance of Tapenade on a set of codes that includes industrial-size applications. We present the extensions of the tool planned for the foreseeable future, deriving from our ongoing research on AD.
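
    The structure of an adjoint code under this model can be suggested by a minimal hand-written sketch: a forward sweep saves values about to be overwritten, then a backward sweep restores them while propagating adjoints in reverse. The push/pop primitives below are hand-rolled stand-ins for the runtime primitives a tool like Tapenade links against; the primal statements are invented.

        #include <stdio.h>

        static double stack[64];
        static int    top = 0;
        static void   push(double v) { stack[top++] = v; }
        static double pop(void)      { return stack[--top]; }

        /* Adjoint of the primal   x = x * x;  y = 3.0 * x;
           computing xb += d(3*x^2)/dx * yb while restoring x. */
        void f_adjoint(double *x, double *xb, double yb) {
            /* Forward sweep: x is overwritten, so save it first. */
            push(*x);
            *x = (*x) * (*x);

            /* Backward sweep, statements in reverse order. */
            double tb = 3.0 * yb;     /* adjoint of y = 3.0 * x */
            *x = pop();               /* restore the pre-overwrite x */
            *xb += 2.0 * (*x) * tb;   /* adjoint of x = x * x */
        }

        int main(void) {
            double x = 2.0, xb = 0.0;
            f_adjoint(&x, &xb, 1.0);
            printf("d(3*x^2)/dx at x=2: %g\n", xb);  /* expect 12 */
            return 0;
        }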
